{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "👉 If for the entire FineTuning project I wanted to use the HuggingFace Training API to train this model, which will simplify the training and evaluation process considerably, the below is the guideline workflow.\n",
    "\n",
    "( NOTE - THIS IS JUST EXAMPLE CODE FOR GUIDELINE - IT WILL NOT RUN FULLY )\n",
    "\n",
    "👉 The HuggingFace Training API centers around the Trainer and TrainingArguments classes. The `Trainer` class encapsulates the training loop and handles everything from model training, evaluation, and saving. `TrainingArguments` allows for easy configuration of the training process.\n",
    "\n",
    "👉 To use the `Trainer`, we need to implement a `Dataset` class (which we've already done) and a model that inherits from `nn.Module` and has a forward method that returns a dictionary (our `SentimentClassifier` is fine, with a slight adjustment).\n",
    "\n",
    "👉 Let's outline the steps to leverage the HuggingFace Training API:\n",
    "\n",
    "👉 We will first initialize the training and evaluation data loaders. We'll continue to use our `create_data_loader` method, which already returns DataLoader objects that we can pass to HuggingFace's Trainer.\n",
    "\n",
    "```python\n",
    "train_data_loader = create_data_loader(df_train, tokenizer, max_len = MAX_LEN, batch_size = BATCH_SIZE)\n",
    "eval_data_loader = create_data_loader(df_eval, tokenizer, max_len = MAX_LEN, batch_size = BATCH_SIZE)\n",
    "```\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "👉 Next, we will make a minor adjustment to the `SentimentClassifier` model's `forward` method. The HuggingFace `Trainer` expects the `forward` method to return a dictionary. We should return both the `loss` and `logits` in the `forward` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def forward(self, input_ids, attention_mask, labels=None):\n",
    "    _, pooled_output = self.bert(input_ids = input_ids, attention_mask= attention_mask)\n",
    "    output = self.drop(pooled_output)\n",
    "    logits = self.out(output)\n",
    "    loss = None\n",
    "    if labels is not None:\n",
    "        loss_fct = nn.CrossEntropyLoss()\n",
    "        loss = loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1))\n",
    "    return {\"loss\": loss, \"logits\": logits}\n",
    "\n",
    "\n",
    "# 👉 We will then instantiate the `TrainingArguments` and the `Trainer`. \n",
    "# The `TrainingArguments` define the set of hyperparameters that we will use for training. \n",
    "# We use the same hyperparameters as in our code:\n",
    "\n",
    "from transformers import Trainer, TrainingArguments\n",
    "\n",
    "training_args = TrainingArguments(\n",
    "    output_dir='./results',\n",
    "    num_train_epochs=3,\n",
    "    per_device_train_batch_size=BATCH_SIZE,\n",
    "    warmup_steps=500,\n",
    "    weight_decay=0.01,\n",
    "    logging_dir='./logs',\n",
    "    logging_steps=10,\n",
    "    load_best_model_at_end=True,\n",
    "    evaluation_strategy='epoch', # evaluates the model at the end of each epoch\n",
    ")\n",
    "\n",
    "model = SentimentClassifier(n_classes=5)  # assuming there are 5 classes\n",
    "\n",
    "trainer = Trainer(\n",
    "    model=model,\n",
    "    args=training_args,\n",
    "    train_dataset=train_data_loader.dataset,\n",
    "    eval_dataset=eval_data_loader.dataset,\n",
    "    data_collator=collator,  # the data collator we defined\n",
    "    tokenizer=tokenizer,  # the tokenizer we defined\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "👉 We can then call `trainer.train()` to start training and `trainer.evaluate()` to evaluate the model after training:\n",
    "\n",
    "```python\n",
    "trainer.train()\n",
    "trainer.evaluate()\n",
    "```\n",
    "\n",
    "👉 In our current code, we manually implement the training loop, including gradient accumulation, gradient clipping, and optimizer stepping, etc. With the HuggingFace `Trainer`, all these are handled for we. Moreover, we get additional features like gradient accumulation, mixed precision training, tensorboard logging, and several others, without having to implement these features manually.\n",
    "\n",
    "👉 Keep in mind that although this method can significantly simplify the training process, it may not be as flexible as manually controlling the training loop, especially for complex training logic or when we have to interleave training and evaluation steps in a specific way."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.9"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
